Overlapping sound event recognition using local spectrogram features and the generalised Hough transform
Abstract
In this paper, we address the challenging task of simultaneously recognising overlapping sound events from single-channel audio. Conventional frame-based methods are not well suited to the problem, as each time frame contains a mixture of information from multiple sources. Missing feature masks can improve recognition in such cases, but they are limited by the accuracy of the mask, and estimating an accurate mask is itself a non-trivial problem. We propose an approach based on Local Spectrogram Features (LSFs), which represent local spectral information extracted from the two-dimensional region surrounding “keypoints” detected in the spectrogram. The keypoints are designed to locate the sparse, discriminative peaks in the spectrogram, so that we can model sound events through a set of representative LSF clusters and their occurrences in the spectrogram. To recognise overlapping sound events, we use a Generalised Hough Transform (GHT) voting system, which sums the information over many independent keypoints to produce onset hypotheses and can detect any arbitrary combination of sound events in the spectrogram. Each hypothesis is then scored against the class distribution models to recognise the existence of the sound in the spectrogram. Experiments on a set of five overlapping sound events, in the presence of non-stationary background noise, demonstrate the potential of our approach.
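The keypoint-and-voting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes keypoints are strict local maxima above a threshold, it replaces the LSF cluster codebook with a plain frequency-bin match, and it replaces scoring against class distribution models with a simple vote-count threshold. All names (`detect_keypoints`, `vote_onsets`, `hypotheses`, the toy `MODEL`) are hypothetical.

```python
from collections import defaultdict

def detect_keypoints(spec, threshold):
    """Return (t, f) cells that exceed `threshold` and are strict local
    maxima over their 8 neighbours. `spec` is indexed as spec[t][f]."""
    n_t, n_f = len(spec), len(spec[0])
    keypoints = []
    for t in range(n_t):
        for f in range(n_f):
            v = spec[t][f]
            if v < threshold:
                continue
            neighbours = [spec[tt][ff]
                          for tt in range(max(0, t - 1), min(n_t, t + 2))
                          for ff in range(max(0, f - 1), min(n_f, f + 2))
                          if (tt, ff) != (t, f)]
            if all(v > n for n in neighbours):
                keypoints.append((t, f))
    return keypoints

def vote_onsets(keypoints, model):
    """GHT-style voting. `model` maps a class name to a list of (dt, f)
    pairs: a keypoint expected dt frames after onset in frequency bin f.
    Assumption: a bin match stands in for matching an LSF cluster."""
    votes = defaultdict(int)
    for cls, template in model.items():
        for t, f in keypoints:
            for dt, model_f in template:
                if model_f == f and t - dt >= 0:
                    votes[(cls, t - dt)] += 1  # vote for this onset frame
    return votes

def hypotheses(votes, model, min_fraction=0.8):
    """Keep (class, onset, votes) where enough model keypoints agreed.
    Assumption: a vote-count threshold replaces model-based scoring."""
    kept = [(cls, onset, count)
            for (cls, onset), count in votes.items()
            if count >= min_fraction * len(model[cls])]
    return sorted(kept)

# Toy model: two hypothetical classes with different keypoint layouts.
MODEL = {
    "bell": [(0, 1), (3, 4)],
    "horn": [(0, 3), (2, 0), (4, 5)],
}

def toy_spectrogram(placements, n_frames=12, n_bins=6):
    """Build a spectrogram with unit peaks for each (class, onset) pair."""
    spec = [[0.0] * n_bins for _ in range(n_frames)]
    for cls, onset in placements:
        for dt, f in MODEL[cls]:
            spec[onset + dt][f] = 1.0
    return spec

if __name__ == "__main__":
    # The two events overlap in time: bell spans frames 2-5, horn 3-7.
    spec = toy_spectrogram([("bell", 2), ("horn", 3)])
    kps = detect_keypoints(spec, threshold=0.5)
    print(hypotheses(vote_onsets(kps, MODEL), MODEL))
```

Because each keypoint votes independently, the peaks of one event do not disturb the votes accumulated for the other, which is the property that makes voting attractive for overlapping events. A realistic system would still need a learned codebook and a probabilistic scoring step, both of which this sketch omits.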
Similar resources
Overlapping Sound Event Recognition using Local Spectrogram Features with the Generalised Hough Transform
We present a novel approach for recognition of overlapping sound events based on the Generalised Hough Transform (GHT) – a technique commonly used for object recognition in the domain of image processing. Unlike our previous work on image-based sound event classification, where we focussed on global image features, here we extract local features from detected interest-points in the spectrogram....
Large scale continuous visual event recognition using max-margin Hough transformation framework
In this paper we propose a novel method for continuous visual event recognition (CVER) on a large scale video dataset using max-margin Hough transformation framework. Due to high scalability, diverse real environmental state and wide scene variability direct application of action recognition/detection methods such as spatio-temporal interest point (STIP)-local feature based technique, on the wh...
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
Spiking neural networks and the generalised Hough transform for speech pattern detection
This paper proposes a novel spiking neural network (SNN) architecture that integrates with the generalised Hough transform (GHT) framework for the task of detecting specific speech patterns such as command words. The idea is that the GHT can model the geometrical distribution of speech information over the wider temporal context, while the SNN is used to learn the discriminative prior weighting in...
An improved generalized Hough transform overlapping objects
The generalized Hough transform (GHT) is a powerful method for recognizing arbitrary shapes as long as the correct match accounts for both much of the model and much of the sensory object. For moderate levels of occlusion, however, the GHT can hypothesize many false solutions. In this paper, we present an improved two-stage GHT procedure for the recognition of overlapping objects. Each boundary...
Journal:
- Pattern Recognition Letters
Volume 34
Publication date: 2013